Enhanced Text-to-Image Synthesis with Self-Supervision

Authors

Abstract

The task of Text-to-Image synthesis is a difficult challenge, especially when dealing with low-data regimes, where the number of training samples is limited. To address this, a Self-Supervised Text-to-Image Generative Adversarial Network (SS-TiGAN) has been proposed. The method employs a bi-level architecture, which allows for the use of self-supervision to increase the training data by generating rotation variants. This, in turn, maximizes the diversity of the model representation and enables the exploration of high-level object information for more detailed image construction. In addition to self-supervision, SS-TiGAN also investigates various techniques to address the stability issues that arise when training Generative Adversarial Networks. By implementing these techniques, the proposed model achieves new state-of-the-art performance on two benchmark datasets, Oxford-102 and CUB. These results demonstrate its effectiveness in synthesizing high-quality, realistic images from text descriptions under low-data regimes.
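
The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of the rotation-based self-supervision idea it describes: each real image is rotated by 0, 90, 180, and 270 degrees, and the discriminator carries an auxiliary head that predicts which rotation was applied, next to the usual real/fake head. Every name here (SelfSupervisedDiscriminator, make_rotations, lambda_rot) is an illustrative assumption rather than the paper's code, and the bi-level architecture and text conditioning are omitted for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

def make_rotations(images):
    """Return the batch rotated by 0/90/180/270 degrees plus the rotation labels."""
    rotated = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4, device=images.device).repeat_interleave(images.size(0))
    return torch.cat(rotated, dim=0), labels

class SelfSupervisedDiscriminator(nn.Module):
    """Shared convolutional backbone with two heads: real/fake and rotation class."""
    def __init__(self, channels=3, width=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(channels, width, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width * 2, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.adv_head = nn.Linear(width * 2, 1)   # real vs. fake logit
        self.rot_head = nn.Linear(width * 2, 4)   # which of the 4 rotations

    def forward(self, x):
        h = self.backbone(x)
        return self.adv_head(h), self.rot_head(h)

def discriminator_loss(disc, real, fake, lambda_rot=1.0):
    """Non-saturating GAN loss plus the rotation-prediction auxiliary loss."""
    real_logits, _ = disc(real)
    fake_logits, _ = disc(fake.detach())
    adv = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) \
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    rotated, rot_labels = make_rotations(real)
    _, rot_logits = disc(rotated)
    return adv + lambda_rot * F.cross_entropy(rot_logits, rot_labels)

In this kind of setup the rotation task acts as a regularizer on the discriminator's features, which is one common way to keep GAN training stable when real samples are scarce.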


Similar articles

Computer Vision Report: Text to Image Synthesis

Generative adversarial networks have been shown to generate very realistic images by learning through a min-max game. Furthermore, these models are known to model image spaces more easily when conditioned on class labels. In this work, we consider conditioning on fine-grained textual descriptions, thus also enabling us to produce realistic images that correspond to the input text description. A...

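The min-max game mentioned above, extended with conditioning on a description, is usually written as the conditional GAN objective. In generic notation (G generator, D discriminator, x image, t text embedding, z noise; these symbols are ours, not the report's):

\min_G \max_D \; \mathbb{E}_{(x,t)\sim p_{\mathrm{data}}}\big[\log D(x,t)\big] \;+\; \mathbb{E}_{z\sim p_z,\; t\sim p_{\mathrm{data}}}\big[\log\big(1 - D(G(z,t),\,t)\big)\big]

Conditioning both G and D on t is what ties the generated image to the input description rather than only to a class label.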

Generative Adversarial Text to Image Synthesis

Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, deep convolutional generative adversarial networks (GANs) have begun to generate highly com...


Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis

We propose a novel hierarchical approach for text-to-image synthesis by inferring semantic layout. Instead of learning a direct mapping from text to image, our algorithm decomposes the generation process into multiple steps, in which it first constructs a semantic layout from the text by the layout generator and converts the layout to an image by the image generator. The ...

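The two-step decomposition described above (text to semantic layout, then layout to image) can be pictured as two chained generators. The sketch below is purely illustrative and assumes hypothetical LayoutGenerator and ImageGenerator modules; it is not the authors' implementation.

import torch.nn as nn

class HierarchicalTextToImage(nn.Module):
    """Illustrative two-stage pipeline: text embedding -> semantic layout -> image."""
    def __init__(self, layout_generator: nn.Module, image_generator: nn.Module):
        super().__init__()
        self.layout_generator = layout_generator  # e.g. predicts object boxes/masks from text
        self.image_generator = image_generator    # renders the final image from the layout

    def forward(self, text_embedding, noise):
        layout = self.layout_generator(text_embedding, noise)  # step 1: construct the layout
        image = self.image_generator(layout, text_embedding)   # step 2: convert layout to image
        return image, layout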

Text to visual synthesis with appearance models

This paper presents a new method named text to visual synthesis with appearance models (TEVISAM) for generating videorealistic talking heads. In a first step, the system learns a person-specific facial appearance model (PSFAM) automatically. PSFAM allows modeling all facial components (e.g. eyes, mouth, etc) independently and it will be used to animate the face from the input text dynamically. ...


Tagging an Unfamiliar Text With Minimal Human Supervision

In this paper, we will discuss a method for tagging an unannotated text corpus whose structure is completely unknown, with a little bit of help from an informant. Starting from scratch, automated and semi-automated methods are employed to build a part of speech tagger for the text. There are three steps to building the tagger: uncovering a set of part of speech tags, discovering for each word i...



Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3268869